AI and Machine Learning by Rahman Was;

Author: Rahman, Was
Language: eng
Format: epub
Publisher: SAGE Publications
Published: 2020-07-24T12:43:55+00:00


Deep Learning

Deep learning⁵⁶ is a family of ML techniques that help find more accurate and sophisticated answers to ML questions. It can be used for supervised, unsupervised or semi-supervised models.

The key idea behind deep learning is breaking down the learning process into a series of steps and representing each learning step as a connected ‘layer’ of processing. Each layer works on a different piece of the overall problem and makes its answer available to the other layers. The overall result of the whole activity is obtained by combining the different answers from the different layers.

These layers are usually drawn as physical layers in a diagram, but you should be clear that the picture is not literal: the layers are computer programs. Each layer is a set of rules and instructions for performing calculations on data, and the result of those calculations is the output of the layer. It’s called ‘deep’ learning because many layers can be involved; eight to ten is common.
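
To make the idea of a layer as a small program more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular deep learning system is written. The names `make_layer` and `run_network`, and all the numbers, are invented for this example. Each layer multiplies its inputs by some weights, adds them up, and passes the result to the next layer.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(weights: List[List[float]], biases: List[float]) -> Layer:
    """Build one layer: a weighted sum per output, then a simple cut-off."""
    def layer(inputs: Vector) -> Vector:
        outputs = []
        for row, bias in zip(weights, biases):
            total = sum(w * x for w, x in zip(row, inputs)) + bias
            # Assumed cut-off rule (often called ReLU): negative totals become 0.
            outputs.append(max(0.0, total))
        return outputs
    return layer


def run_network(layers: List[Layer], inputs: Vector) -> Vector:
    """Pass the data through each layer in turn; each output feeds the next layer."""
    data = inputs
    for layer in layers:
        data = layer(data)
    return data


# Hypothetical, hand-picked weights purely for illustration.
layer_one = make_layer([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])
layer_two = make_layer([[1.0, 2.0]], [-1.0])

result = run_network([layer_one, layer_two], [2.0, 1.0])
print(result)
```

A real network would have far more numbers in each layer, and those numbers would be learned from data rather than typed in by hand, but the shape of the calculation is the same.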

We can return to OCR for a deliberately over-simplified, hypothetical illustration of deep learning. OCR is done using deep learning, but the way it’s done in practice today is more complex than the form described below.

In technical terms, AI performs OCR by detecting the text in an image (i.e. distinguishing between text, images, decorative elements like borders and other items like smudges), then identifying what it’s detected (i.e. recognizing letters and words). We’ll focus on the first piece, detecting the text.

As with all such examples, this one is built on three ideas: that ML involves a set of rules and instructions to achieve a result, that it uses feedback to improve those rules and instructions, and that deep learning consists of several layers, each of which performs a small step of the overall ML activity.
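
The idea of using feedback to improve a rule can be sketched in a few lines of Python. This is a toy example under simple assumptions, not a description of how OCR systems are trained. The rule here is a single number, `weight`, and the function name `train_weight`, the learning rate and the example data are all invented for illustration. After each guess, the program measures how wrong it was and nudges the number in the direction that reduces the error.

```python
def train_weight(examples, weight=0.0, learning_rate=0.1, rounds=50):
    """Adjust one number so that weight * x gets close to the target for each example."""
    for _ in range(rounds):
        for x, target in examples:
            prediction = weight * x          # apply the current rule
            error = prediction - target      # feedback: how far off was it?
            weight -= learning_rate * error * x  # nudge the rule to reduce the error
    return weight


# Hypothetical training data in which the target is always twice the input.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

learned = train_weight(examples)
print(round(learned, 3))
```

With this data the learned number settles very close to 2, which is the rule hidden in the examples. Deep learning applies the same kind of feedback to the many numbers inside every layer at once.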

So, to use deep learning to detect text, we start by breaking that activity down into small steps, in this case three, and use a separate deep learning layer to perform each step. The first layer (remember, a computer program) examines all the dark parts of the image (i.e. data representing the printed text and any other marks on the page), and determines where the boundaries are, using appropriate rules and instructions created when it was designed. It then needs to tell the next step (layer) where those boundaries are. To do this, it needs to represent that information in a form that the next layer’s computer program can receive and process. As we know, this representation is done using the language of maths.

So, the output of the first layer is a set of mathematical data describing the locations of all the dark pieces in the image, including their boundaries. This data is the input to the next layer.
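
As a rough sketch of what this first step might look like, the Python below treats an image as a grid of brightness values from 0 (black) to 255 (white). Everything here is an assumption made for illustration: the function names `find_dark_pixels` and `find_boundaries`, the brightness threshold of 128, and the rule that a boundary is any dark pixel with a non-dark neighbour. Unlike a real deep learning layer, these rules are written by hand rather than learned.

```python
def find_dark_pixels(image, threshold=128):
    """Return the (row, column) positions of every pixel darker than the threshold."""
    return {
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value < threshold
    }


def find_boundaries(dark):
    """Return the dark pixels that touch at least one non-dark position."""
    boundary = set()
    for r, c in dark:
        neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        # Positions outside the image are not in `dark`, so edge pixels count as boundary.
        if any(n not in dark for n in neighbours):
            boundary.add((r, c))
    return boundary


# Hypothetical 5 x 5 image: a white page with a 3 x 3 dark square in the middle.
W, D = 255, 0
image = [
    [W, W, W, W, W],
    [W, D, D, D, W],
    [W, D, D, D, W],
    [W, D, D, D, W],
    [W, W, W, W, W],
]

dark = find_dark_pixels(image)
boundary = find_boundaries(dark)
print(sorted(boundary))
```

The two sets of coordinates are the ‘mathematical data’ in this sketch: a list of numbers that the next step can read without ever looking at the original image.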

The purpose of the second layer is to recognize the shapes of the dark pieces that the first layer identified. It does this using a different set of mathematical rules and instructions, and represents the answer using different mathematical language. The output from this layer


